Over- and Under-sampling Approach for Extremely Imbalanced and Small Minority Data Problem in Health Record Analysis
Related resources
The Imbalanced Training Sample Problem: Under or over Sampling?
The problem of imbalanced training sets in supervised pattern recognition methods is receiving growing attention. An imbalanced training sample is one in which one class is represented by a large number of examples while the other is represented by only a few. It has been observed that this situation, which arises in several practical domains, may produce an important deterioration of the classificatio...
Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem
The class imbalanced problem occurs in various disciplines when one of the target classes has a tiny number of instances compared to the other classes. A typical classifier normally ignores or fails to detect a minority class because of its small number of instances. SMOTE is one of the over-sampling techniques that remedies this situation. It generates minority instances within the overlapping regio...
Borderline over-sampling for imbalanced data classification
Traditional classification algorithms often perform poorly on imbalanced data sets in which some classes are heavily outnumbered by the remaining classes. For this kind of data, minority class instances, which are usually of much greater interest, are often misclassified. The paper proposes a method to deal with them by changing class distribution through oversampling at the borderline b...
Blending Propensity Score Matching and Synthetic Minority Over-sampling Technique for Imbalanced Classification
Real-world data sets often contain disproportionate sample sizes of observed groups, making the task of prediction algorithms very difficult. One of the many ways to combat inherent bias from class-imbalanced data is to perform re-sampling. In this paper we discuss two popular re-sampling approaches proposed in the literature, Synthetic Minority Over-sampling Technique (SMOTE) and Propensity Score Mat...
An Analysis of Classification of Imbalanced Datasets by Using Synthetic Minority Over-Sampling Technique
Analysing unbalanced datasets is one of the challenges that practitioners in the machine learning field face. Much research has been carried out to determine the effectiveness of the synthetic minority over-sampling technique (SMOTE) in addressing this issue. The aim of this study was therefore to compare the effectiveness of the SMOTE over different models on unbalance...
Journal
Journal title: Frontiers in Public Health
Year: 2020
ISSN: 2296-2565
DOI: 10.3389/fpubh.2020.00178